FILAMENTOUS HEMAGGLUTININ OF B. pertussis

CROSS REFERENCE TO RELATED APPLICATIONS
This application is a continuation-in-part of
Application Serial No. 263,648, filed October 27, 1988,
which is incorporated herein by reference.

INTRODUCTION 

Technical Field

This invention relates to the gene encoding
filamentous hemagglutinin of B. pertussis, the protein
product and the use of the gene and the product for
developing vaccines by genetic engineering techniques.

BACKGROUND 

Bordetella pertussis is a small gram negative
bacillus found only in humans. It is the etiologic
agent of the childhood disease whooping cough, also
known as pertussis. In susceptible individuals, the
disease may progress to a serious paroxysmal phase.
Violent and spasmodic coughing occurs, with the patient
being subject to secondary injury from the hypoxia and
convulsions attendant with the coughing paroxysms.
Secondary infections, encephalopathy and death may
occur. The discrete molecular moiety that has been
associated with the severe effects in the paroxysmal
stage of the disease is pertussis toxin (PTX). PTX has
been reported under a variety of names, including
lymphocytosis promoting factor, histamine sensitizing
factor and islet-activating protein.

Another protein, filamentous hemagglutinin
(FHA), is a surface associated protein expressed by B.
pertussis under the control of a trans-acting vir
locus. FHA, while poorly characterized, is thought to
act as a major adhesin and immunodominant antigen in
the course of human infection. This protein appears as
a heterogeneous collection of polypeptide species on
sodium dodecylsulfate-polyacrylamide gel
electrophoresis, ranging from approximately 60 to 220
kDa (kilodaltons). It is likely that most of the
smaller, commonly seen protein gel bands represent
degradation products of a dominant 220 kDa species.
Electron microscopy of this protein reveals a
filamentous structure with dimensions of 2 nm by
40-100 nm.

It has been suggested that FHA is one of the
most important factors mediating the
bacterial-eukaryotic cell adhesive interactions.
Furthermore, FHA stimulates an immune response in
humans following clinical disease and acts as an
immunoprotective antigen in a model system employing
aerosol challenge of immunized mice. Although less
effective than PTX when used alone, FHA and PTX
together demonstrate a synergistic immunoprotective
effect.

RELEVANT LITERATURE
A description of the B. pertussis
hemagglutinin protein may be found in Irons et al.,
J. Gen. Microbiol. (1983) 129:2769-2778; Arai and Sato,
Biochim. Biophys. Acta (1976) 444:765-782; and Zhang et
al., Infect. Immun. (1985) 48:422-427. Physiological
properties are described by Tuomanen and Weiss, J.
Infect. Dis. (1985) 152:118-125; Lenin et al., FEMS
Microbiol. Lett. (1986) 37:89-94; Urisu et al., Infect.
Immun. (1986) 52:695-701; Redd et al., J. Clin.
Microbiol. (1988) 26:1373-1377; Oda et al., J. Infect.
Dis. (1984) 150:823-833; Robinson and Irons, Infect.
Immun. (1983) 40:523-528; Sato and Sato, ibid. (1984)
46:415-421; and Ad Hoc Group for the Study of Pertussis
Vaccines, Lancet i (1988) 955-960.

Cloning of the filamentous hemagglutinin
structural gene or fragment thereof has been reported
by Brown and Parker, Infect. Immun. (1987) 55:154-161;
Reiser et al., Dev. Biol. Stand. (1985) 61:265-271;
Mattei et al., FEMS Microbiol. Lett. (1986) 36:73-77 and
Stibitz et al., J. Bacteriol. (1988) 170:2904-2913.

Chemical analysis of the filamentous
hemagglutinin has been reported by Sato et al., Infect.
Immun. (1983) 41:313-320.



SUMMARY OF THE INVENTION
DNA sequences encoding at least a portion of
the B. pertussis fhaB gene, genetically engineered
products including such sequences, the expression
products of such sequences, and cells containing such
genetically engineered sequences are provided for use
in the diagnosis, prophylaxis and therapy of whooping
cough.

DESCRIPTION OF THE SPECIFIC EMBODIMENTS 
The subject invention concerns nucleotide
sequences associated with the filamentous hemagglutinin
protein of B. pertussis and their use in the diagnosis,
prophylaxis and therapy of whooping cough or
pertussis. The open reading frame is about 10 kbp
(specifically about 10789 bp) as the sequence set forth
in the experimental section. It encodes a protein of
about 368 kDa (about 3597 amino acids), comprising an
N-proximal fragment of 230 kDa, which N-proximal
fragment is divided by proteolysis into two polypeptide
fragments of about 98 and 140 kDa at an arginine-rich
peptide sequence RRARR, which are the N-terminal and
C-terminal fragments, respectively. This sequence may
act as a proteolytic cleavage site. The overall
polypeptide is basic, has a relatively high charge
density, a pKI of 9.65 and a net charge of +19.
Alanine and glycine constitute 27% of the total
residues, while only 3 cysteines are present. The last
350 amino acids provide a highly basic region (charge
+32; pKI 11.3) rich in proline (17%). At amino acid
position 1097 (defined by the start of translation at
253 bp from the left-hand EcoRI site) and again at
position 2599 is the tripeptide sequence RGD. This
sequence is known as a "cell recognition site" for the
interaction of fibronectin and other eukaryotic
extracellular matrix proteins with certain eukaryotic
cell receptors, particularly mammals, and may function
in a similar manner in FHA mediated bacterial adherence.
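The motif positions discussed above (RGD, RRARR) can be located in any translated sequence with a simple scan. The following is a minimal sketch only; the function name and the toy peptide are hypothetical and are not the FHA sequence, and positions are reported 1-based to match the residue numbering convention used in this description.

```python
import re

def find_motifs(protein, motifs=("RGD", "RRARR")):
    """Return {motif: [1-based start positions]}, including overlapping hits.

    `protein` is a one-letter amino acid string. The default motifs are the
    two short sequences named in the text; any others may be supplied.
    """
    hits = {}
    for motif in motifs:
        # A zero-width lookahead lets overlapping occurrences be reported.
        pattern = re.compile("(?=" + re.escape(motif) + ")")
        hits[motif] = [m.start() + 1 for m in pattern.finditer(protein)]
    return hits

# Hypothetical toy peptide for illustration only (not the FHA sequence).
toy_peptide = "MKTRGDAALRRARRALRQDFFTPGSRGD"
print(find_motifs(toy_peptide))
```

Applied to the full predicted polypeptide, such a scan would be expected to report the positions given above, provided the same translation start is used.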

The gene appears to be located adjacent to the
vir locus, in the direction defined by transcription.
An apparent regulatory gene fhaA lies about 2-5 kb
downstream from fhaB, followed by the gene fhaC, also
believed to be a regulatory gene, again in the
downstream direction from fhaA. The beginning of the
ORF is separated by approximately 430 bp from the first
of the bvg genes, bvgA. The gene begins at position 253
from the left at the pDR1 EcoRI site and ends at
position 11041 with a TAG codon.

The fhaB gene is characterized by having a 
high GC content, namely about 67.5%. In addition, 
there is a series of tandem direct nucleotide repeats 
of the pattern ABABA in the region from nucleotide 1468 
to nucleotide 1746, with the G of the sequence reported 
in the Experimental section being nucleotide 1. An
unusual alternating repeat (PK)5 begins at residue 
3477. The sequence VEWPRKVET at position 3319 is 
repeated at position 3350. Transcriptional initiation 
appears to occur 70-75 bp upstream of the ORF. 
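A GC content figure such as the 67.5% stated above can be checked against any nucleotide sequence with a short calculation. The sketch below is illustrative only; the function names and the window and step parameters are assumptions introduced here, not part of the original analysis, which used the software packages named in the Experimental section.

```python
def gc_fraction(seq):
    """Fraction of G and C among the unambiguous bases (A, C, G, T) of seq."""
    counted = [base for base in seq.upper() if base in "ACGT"]
    if not counted:
        return 0.0
    return sum(base in "GC" for base in counted) / len(counted)

def windowed_gc(seq, window, step):
    """GC fraction of successive windows, useful for spotting local variation."""
    return [
        gc_fraction(seq[i:i + window])
        for i in range(0, len(seq) - window + 1, step)
    ]

# Hypothetical short sequence for illustration only.
example = "GGCCATGCGCGCTA"
print(round(gc_fraction(example) * 100, 1), "% GC")
```

The windowed form may be of use when choosing synthetic portions or host organisms on the basis of GC preference.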
Fragments of the open reading frame of at
least about 15 bp, more usually at least about 50 bp,
and usually at least about 100 bp may find use in a
variety of ways. The fragments may be used for
diagnostic purposes, as probes in hybridizing to DNA or
RNA for detecting the presence of B. pertussis or the
like. Use of Southerns, Northerns, dot-blot, or other
techniques may be employed. The fragments may be used
for encoding peptides of at least about 9 amino acids
(27 bp), usually at least about 12 amino acids.

The fragments may also be used in the
antisense direction to modulate the amount of the
expression product of the fhaB gene, where such
modulation may be of interest. Thus, the infectious
ability of the organism may be modulated and/or
attenuated by reducing the presence of the filamentous
hemagglutinin protein on the surface of the organism.
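The antisense orientation of a fragment is simply the reverse complement of the coding strand. A minimal sketch follows, assuming 1-based inclusive coordinates of the kind used throughout this description; the function names and the example sequence are hypothetical.

```python
# Translation table for complementing unambiguous DNA bases.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")

def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]

def antisense_fragment(seq, start, end):
    """Antisense strand of the fragment from start to end (1-based, inclusive)."""
    if not 1 <= start <= end <= len(seq):
        raise ValueError("coordinates fall outside the sequence")
    return reverse_complement(seq[start - 1:end])

# Hypothetical sequence for illustration only.
print(antisense_fragment("AAATGCCC", 3, 6))
```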

Fragments of interest of the fhaB gene include
those fragments associated with the expression of the
98 kDa protein and the 230 kDa protein. Using the
numbering as set forth in the sequence provided in this
application, the fragment for the 98 kDa protein would
terminate between nucleotides 3402 and 3502, usually
between 3451 and 3474. The 230 kDa protein is
initiated in that region and terminates at about
nucleotide 9624. When FHA is originally isolated and
purified from B. pertussis liquid culture supernatant
using standard techniques there are often 3-4 bands
seen on SDS-PAGE, with polypeptide species of 230, 140,
125 and 98 kDa. With increasing time of storage, two
new species appear, 75 and 58 kDa, with concurrent
fading of the 230 kDa band and intensification of the
125 and 98 kDa bands. An identical N-terminal sequence
is observed for the 140 and 125 kDa fragments:
A-L-R-Q-D-F-F-T-P-G-S-V-V-V-R-A-Q-G-N. This peptide is
encoded beginning at position 1074, immediately
downstream from a proposed proteolytic cleavage site
R-R-A-R-R, and terminating at position 1131. Also of
interest is the repeat sequence, where the sequence
should have at least two repeats, preferably three
repeats, and the fragment will be at least about 60
nucleotides, more usually about 100 nucleotides, and
may be 278 nucleotides or more, usually not exceeding
about 300 nucleotides of the open reading frame, the
latter encompassing the entire repeat region. The
repeats do not have perfect homology, but show a high
degree of conservation.

Regions of interest will be those encoding
amino acid sequences 1211 to 1216 (E-A-R-K-D-E), 1876
to 1881 (R-K-D-E-H-R) and 3075 to 3080 (S-K-Q-D-E-R),
and adjoining amino acid sequences, extending up to 100
amino acids, usually up to 50 amino acids in either
direction, but particularly including at least 3 amino
acids of the sequences described above. DNA sequences
of interest may include fragments of 3490 to 3590, 3840
to 3940, 5840 to 5940, 9440 to 9540, and fragments of
at least 15 bp, more usually at least 25 bp thereof. The
fragment from about 5625 to 5780 does not appear to
have any features of interest and may be excluded,
unless joined to one of the fragments indicated above.

Antisera prepared against the B. pertussis FHA
protein cross-react with polypeptide species of B.
parapertussis and B. bronchiseptica. Antisera binding
to the expression products of the regions 2836-3786 nt,
5212-7294 nt and 6393-8080 nt bound to peptides of
B. parapertussis, while only the antisera of the first
two bound to peptides of B. bronchiseptica.

The subject protein or any portion thereof may
be prepared in any convenient host, preferably
prokaryotic. By transforming an appropriate host with
the expression construct, the host will express the
polypeptide of interest, which may then be isolated or,
as appropriate, the host may be isolated containing the
subject protein or portion thereof and used as a
vaccine.



The expression construct or cassette will 
employ a transcription initiation region, the
structural gene for the polypeptide to be expressed,
and a transcriptional termination region. The
transcriptional initiation region may include only the 
RNA polymerase binding site or may also include an 
enhancer or operator to provide for increased 
expression of the subject protein or portion thereof, 
or inducible expression of the subject protein or 
portion thereof. 

A large number of transcription initiation 
regions are known which are active in one or more 
prokaryotic hosts, such as the lambda left or right 
promoters, the lac promoter, the trp promoter, the tac 
promoter, omp promoter, metallothionein promoter, 
etc. The natural promoter may also find use. The 
particular promoter will be chosen to provide for 
efficient expression in accordance with the selection 
of the host cell line. 

For the most part, prokaryotic host cell lines 
will be used to provide for efficient expression of the 
filamentous hemagglutinin or portion thereof, integrity 
of the expression product, ease of isolation of the 
expression product, and in some situations, the ability 
to use the host without isolation of the protein, using 
the transformed host as the vaccine. Various organisms 
may be used which may provide for an immune response 
not only to the subject proteins or portions thereof, 
but also to other pathogens, so that the vaccine will 
result in immune protection, not only against the B. 
pertussis organism but also against disease caused by 
other pathogens. 

Various host organisms which may be used
include various gram negative organisms, such as E.
coli, Salmonella, Yersinia, Pseudomonas, Bordetella,
such as the species avium, bronchiseptica,
parapertussis and pertussis, where the last two are
particularly preferred.

As previously indicated, sequence analysis of
the subject gene indicates a guanine plus cytosine
content considerably higher than that of the
traditional E. coli cloning host (approximately 50%).
Therefore, for the most part, the host will desirably 
have a high guanine plus cytosine content in its 
genome, preferably at least 60%, more preferably 65%. 
However, one may use synthetic portions to reduce the 
ratio of guanine and cytosine for use in organisms 
lacking a preference for GC. 

Various replication systems are available for 
use in the various host species. For the most part, 
the vectors will include not only a functional 
replication system but a marker for selecting 
transformants comprising the subject structural gene or 
portion thereof. While it is usually desirable to 
employ either a plasmid or virus which is stably 
maintained as a vector without lysogeny, to enhance the 
efficiency of expression by having a multicopy 
replication system which is stable in the host, this is 
not necessary. Thus, one can transform with bare DNA 
comprising the expression cassette in combination with 
a marker for selection, where the marker may be joined 
to the expression cassette or be independently present
in the transformation media. In some situations, a 
vector will be employed which does not have a stable 
replication system for the expression host. In this 
manner, selection can be carried out to insure that 
integration has occurred by selecting for those cells 
containing the marker. 

A wide variety of markers may be used which 
include antibiotic resistance, resistance to heavy 
metals, imparting prototrophy to an auxotrophic host, 
or the like. The particular choice of marker is not 
critical to this invention, but will be selected for 
efficiency in selection and efficiency in production of
the subject protein or portion thereof. 

Depending on the manner of transformation, as
well as the host, various other functional capabilities
may be provided in the vector. For example, transfer
capability may be provided which allows for conjugation
in conjunction with a helper plasmid, where once
transferred to the recipient host, the vector may no
longer be transferred to other hosts. For example, the
rlx sequence may be employed, particularly from the P-1
incompatibility group. In addition, the cos site may
be employed from bacteriophage lambda. Other markers
of interest may include a gene which renders an
antibiotic resistant strain sensitive.

The termination region is not critical to this
invention and any convenient termination region may be
used. The native termination region may be employed or
a termination region which is normally associated with
the transcription initiation region or a different
region. The fact is that many transcription
termination regions have been employed and are
generally available and may be used with advantage.

The host may be transformed in any convenient
way. By using bare DNA, calcium phosphate precipitated
DNA may be employed for transformation. Alternatively,
conjugation may be employed using a helper plasmid,
where a transfer gene is provided in a vector. In some
instances, it may be desirable to employ a
bacteriophage vector, where the host cell will be
transduced or transfected. The technique for
introducing the expression cassette comprising the
subject gene or portion thereof is not critical to this
invention and various alternative protocols find ample
exemplification in the literature.

The subject gene may also be subject to
various lesions or mutations. For example, the
sequence RRARR may be substituted, deleted, or modified
so as to remove the peptidase cleavage site. Thus, the
protein would be retained substantially intact, with
the two potential fragments fused together. This
protein could find a variety of uses. Other mutations
may include the removal of the upstream portion of the
gene, so as to leave only the sequence that is
downstream from the RRARR sequence, where an initiation
codon may be introduced at the appropriate site. In
addition, mutagenesis of an RGD region may cause
altered interactions with eukaryotic target cells and
perhaps an altered host immune response, both of which
may prove useful for disease therapy or prophylaxis.

Mutation can be achieved in a variety of ways
using in vitro mutagenesis, primer repair, the
polymerase chain reaction, restriction site deletions,
insertions, or the like. The particular manner in
which the subject gene is modified is not critical to
this invention and any conventional technique may be
employed which provides for the desired substitutions,
deletions or insertions.

The subject gene can be obtained by EcoRI
digestion of the plasmid pUW21-26. The resulting 10 kb
EcoRI fragment contains the open reading frame of 9375
bp. This fragment may be manipulated at its 5'
terminus in a variety of ways. By employing Bal 31
digestion, the sequence may be resected to remove all
or a portion of the non-coding region 5' of the
initiation codon. Alternatively, one may restrict
either upstream or downstream from the initiation
codon, where the nucleotides removed by restriction
downstream from the initiation codon may be replaced
with an appropriate adapter. In this manner, the
subject sequence may be inserted into a polylinker
downstream from a transcriptional initiation regulatory
region and be under the transcriptional initiation
regulation of such region.



The subject compositions, both nucleotides and
proteins, may find both diagnostic and therapeutic
use. For diagnostic use, as already indicated, the
sequences may be used to detect the presence of nucleic
acid sequences which duplex with the subject sequences
as indicative of the presence of B. pertussis.
Alternatively, the protein or portion thereof may be
used in diagnostic assays, as a labeled or unlabeled
reagent for detection of antibodies to the filamentous
hemagglutinin in a blood sample or the presence of
filamentous hemagglutinin protein in a blood or tissue
sample.

Intact protein or portion thereof may be used
to prepare antibodies which may be used in diagnosis,
prophylaxis or therapy. The antibodies may be
polyclonal or monoclonal, preferably monoclonal.
Desirably, neutralizing antibodies will be obtained.
Antibodies may be mouse antibodies, human antibodies,
chimeric antibodies, e.g., mouse variable region and
the human constant region, or the like. Of particular
interest are those constant regions which bind to
complement, such as IgM and IgG isotypes. The
antibodies may be used for passive immunization or for
treatment in accordance with conventional ways.

The subject compositions also find use as
vaccines, as the protein, by itself or in combination
with other proteins, e.g., acellular compositions, as
cellular compositions in a pertussis or non-pertussis
host, in purified or semi-purified form or the like.
Desirably, the subject compositions are used in
conjunction with a modified pertussis toxin, where the
toxin no longer has ADP-ribosyltransferase activity,
particularly subunit A. This can be achieved by using
ptx3201 as described in Black et al., Science (1988)
240:656-659. By introducing the subject gene under the
transcriptional initiation regulatory control of a
constitutive promoter or an inducible promoter, which
is not regulated by the normal pertussis
transcriptional regulation of the filamentous
hemagglutinin gene, one can provide for the enhanced
presence of the subject protein on the surface of the
B. pertussis cell. In this way, an enhanced immune
response may be achieved in response to vaccinating
with either live or dead organisms.

Because of the various ways in which the
subject composition may be administered, the amount
administered will vary widely. In addition, the amount
of the vaccine will vary in accordance with the nature
of the administration, the frequency of the
administration, the presence or absence of antigen, the
nature of antigen, or the like.

The manner of administration may be oral, 
peritoneal, subcutaneous, intravascular or the like. 
Usually, an inert carrier is employed, such as sugar, 
water, aqueous ethanol, phosphate buffered saline, 
saline, or the like. Adjuvants include aluminum 
hydroxide, vegetable oils, bacterial toxins, etc. The 
amount of the active ingredient will generally be in 
the range of about 25 to 75 µg/kg for a single human
dose. Pertussis vaccines have been used previously,
and prior usage may be used as a guide for the dosage
employed. See, for example, Developments in Biological
Standardization, supra.

The following examples are offered by way of 
illustration and not by way of limitation. 

EXPERIMENTAL 

Materials and Methods 

Bacterial Strains and Plasmids. B. pertussis
strain BP536 is a spontaneously-occurring streptomycin
resistant mutant of the virulent phase (I) parental
strain BP338. BP537 is an avirulent phase variant of
BP536. The isolation of the Tn5 mutant BP353 has been
previously described, Weiss et al., Infect. Immun.
(1983) 42:33-41; the transposon insertion site has been
mapped more recently (Stibitz et al., 1988, supra).
BP338 Tn5-25 carries a Tn5 insertion mutation within
the 2.4 kb BamHI segment of fhaB (Stibitz et al., 1988,
supra). BP-TOX6 (available from R. Rappuoli) is a
derivative of BP536 with a deletion of the pertussis
toxin operon and the substitution of a kanamycin
resistance cassette at that location. BP-B52
(available from F. Mooi) is a BP536 derivative which
carries insertion mutations which inactivate the fim2
and fim3 genes independently. E. coli strains JM101
and SM10 have been described elsewhere (Messing,
Recomb. DNA Tech. Bull. (1979) 2:43-48; Simon et al.,
Bio/Technology (1983) 1:784-791). Cosmid pUW21-26 is a
derivative of pHC79 (Hohn and Collins, Gene (1980)
11:291-298) with an approximately 45 kb insert,
containing the cloned vir and fha loci from BP338
(Stibitz et al., 1988, supra). The construction of
plasmid vector pRTP1 has been described (Stibitz et
al., Gene (1986) 50:133-140).

Cloning of fhaB and Construction of fhaB Deletion
Mutants

The filamentous hemagglutinin (FHA) structural
gene, fhaB, was cloned on a 10 kb EcoRI fragment from
cosmid pUW21-26 into the vector pRTP1, creating the
recombinant plasmid pDR1. An in-frame partial deletion
of fhaB was constructed by re-ligating a pool of BamHI
partial digests of pDR1. Plasmids were screened for
the loss of an internal 2.4 kb BamHI fragment. The
resultant plasmid was designated pDR101.

Bacterial Conjugations and Allelic Exchange
The technique for conjugal transfer of pRTP1
derivatives from E. coli to B. pertussis has been
described (Stibitz et al., 1986, supra). The partially
deleted copy of fhaB was exchanged for the wild type
allele in B. pertussis BP536 in two steps. First, the
E. coli donor, SM10(pDR101), was mated with a B.
pertussis recipient, BP536 Tn5-25, which carries a
selectable marker within the fhaB fragment to be
deleted. Sm-resistant, Ap-resistant exconjugants were
then plated on media containing Sm alone and screened
for the loss of Km resistance, indicating a second
crossover event and acquisition of the mutant allele.

DNA Sequencing and Sequence Analysis 

The 10 kb EcoRI fragment containing fhaB was
subcloned as three separate BamHI fragments as well as
random one to three kb Sau3A fragments in M13mp18 and
M13mp19 (Yanisch-Perron et al., Gene (1985)
33:103-119), pEMBL18 and -19 (Dente et al., Nucleic
Acids Res. (1983) 11:1645-1655), or Bluescript
(Stratagene, San Diego, CA) vectors. DNA inserts were
sequenced by the dideoxy chain-termination method
(Sanger et al., Proc. Natl. Acad. Sci. USA (1977)
74:5463-5467), using either
Klenow fragment or Sequenase (U.S. Biochemical 
Corporation, Cleveland, Ohio). Synthetic 
oligonucleotide primers were designed in order to 
extend sequence reading across large cloned inserts. 
Assembly of the nucleotide sequence was performed using 
the software package of the University of Wisconsin 
Genetics Computer Group (Madison, WI). Further 
analysis of the completed nucleotide and predicted 
peptide sequences was performed, using both this 
package as well as PC/GENE (Intelligenetics, Mountain 
View, CA). 

Hemagglutination 

The ability of B. pertussis strains to
agglutinate sheep erythrocytes was assayed in conical
pointed-bottom wells of polystyrene Microtiter plates
(Dynatech Laboratories, Alexandria, VA). The strains
were grown for two to three days on Bordet-Gengou
plates, washed twice in phosphate-buffered saline, and
resuspended to an OD600 of 10 (1.7x10^10 cells/ml). The
first well of a microtiter plate received 100 µl of this
cell suspension, following which the bacteria were
two-fold serially diluted 11 times. Sheep erythrocytes
were added to each well as 50 µl of a 0.5% PBS-washed
suspension. The plates were left at room temperature
for three to four hours during which time
nonagglutinated erythrocytes slid down the well bottoms
forming a dark pellet. Hemagglutinating (HA) activity
was expressed as the inverse of the highest dilution
without significant pellet formation.
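The titer readout described above may be illustrated by a short calculation. This is a sketch under stated assumptions: the first well is treated as undiluted (dilution factor 1), each subsequent well doubles the dilution, and reading stops at the first well showing a pellet. The function name and the example well readings are hypothetical and are not data from these experiments.

```python
def ha_titer(no_pellet_by_well):
    """Return the HA titer from a two-fold dilution series.

    no_pellet_by_well: sequence of booleans, one per well in order of
    increasing dilution; True means no significant pellet (agglutination).
    Well 0 is assumed to be the undiluted suspension, so well i corresponds
    to a dilution of 1:2**i. Returns 0 if even the first well shows a pellet.
    """
    titer = 0
    for well_index, no_pellet in enumerate(no_pellet_by_well):
        if not no_pellet:
            break  # assumption: stop at the first well with a pellet
        titer = 2 ** well_index
    return titer

# Hypothetical plate row: 12 wells (undiluted plus 11 two-fold dilutions).
example_row = [True] * 5 + [False] * 7
print(ha_titer(example_row))
```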

Western Immunoblots

Polyacrylamide gel electrophoresis was
performed in the presence of sodium dodecylsulfate with
a 10% separating gel and 20 µl of boiled (OD600=10) B.
pertussis cell suspension with sample buffer. Transfer
of protein to nitrocellulose membrane followed the
procedure of Towbin et al., Proc. Natl. Acad. Sci. USA
(1979) 76:4350-4354. Non-specific antibody binding to
the membrane was blocked by pre-incubation with a
solution of PBS and 1% nonfat dry milk. Immunological
detection of FHA was performed using a 1:1000 dilution
of a mixture of (1-54, 1-199, 31E2, 22F10, and 68A6)
monoclonal anti-FHA antibodies (obtained from F. Mooi),
followed by incubation with a 1:250 dilution of
horseradish peroxidase-conjugated goat anti-mouse
antisera. HRP activity was detected using a
tetramethylbenzidine-containing reaction mixture. fim2
and fim3 production were detected using the same
technique and monoclonal antibodies (21E7 and 8E5)
specific for these two proteins (obtained from F.
Mooi).

Southern Hybridization

B. pertussis chromosomal DNA was isolated,
digested with restriction endonucleases, and separated
by agarose gel electrophoresis according to standard
techniques (Maniatis et al. (1982), Molecular
Cloning: A Laboratory Manual, Cold Spring Harbor
Laboratory, Cold Spring Harbor, NY). Transfer of
fragments to nitrocellulose followed the method of
Smith and Summers (Anal. Biochem. (1980)
109:123-129). Hybridization with probe occurred at
37°C, with 50% formamide and 5xSSC. Membranes were
washed twice with 2xSSC at 25°C, twice with 0.1xSSC at
25°C, and then twice with 0.1xSSC at 65°C.

In vitro Bacterial Adherence

B. pertussis strains were grown on plates for
two days and then washed twice in phosphate-buffered
saline (PBS). 20 µl of bacterial suspension (OD600=10)
was added to tissue culture plate wells containing 200
µl of MEM and a cover slip on which approximately 5x10^
Chinese Hamster Ovary cells had been inoculated and
allowed to grow overnight. After incubation at 37°C,
5% CO2, for four hours, each well was washed vigorously
with PBS three times. Any remaining bacteria and CHO
cells were fixed with methanol and then stained with
Giemsa. All bacterial strains were studied in
duplicate and all experiments repeated at least
twice. Bacteria adherent to a single CHO cell were
counted visually and the mean with standard deviation
determined for each strain. Joint 95% confidence
intervals were computed based on central limit theorem
approximations and Bonferroni techniques.
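One common way to obtain joint confidence intervals of the kind described above is to split the overall error rate equally among the strains being compared and apply a normal approximation to each mean. The sketch below follows that approach as an assumption; it is not stated to be the exact procedure used here. The function name, strain labels and counts are hypothetical and are not results from these experiments.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

def bonferroni_intervals(counts_by_strain, joint_confidence=0.95):
    """Mean, standard deviation and Bonferroni-adjusted interval per strain.

    counts_by_strain: {strain label: list of bacteria-per-cell counts},
    each list holding at least two counts.
    Returns {strain label: (mean, sd, lower, upper)}.
    """
    n_strains = len(counts_by_strain)
    # Bonferroni: divide the total error rate equally among the intervals.
    alpha_each = (1.0 - joint_confidence) / n_strains
    z = NormalDist().inv_cdf(1.0 - alpha_each / 2.0)
    results = {}
    for strain, counts in counts_by_strain.items():
        m = mean(counts)
        sd = stdev(counts)  # sample standard deviation
        half_width = z * sd / sqrt(len(counts))  # central limit approximation
        results[strain] = (m, sd, m - half_width, m + half_width)
    return results

# Hypothetical counts for two hypothetical strains, for illustration only.
example = {"strain_A": [10, 12, 14, 16], "strain_B": [1, 2, 3, 2]}
for label, (m, sd, lo, hi) in bonferroni_intervals(example).items():
    print(f"{label}: mean={m:.2f} sd={sd:.2f} interval=({lo:.2f}, {hi:.2f})")
```

With small numbers of counted cells the normal approximation is rough, and a t-based interval might be preferred.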



Results

Identification and Cloning of the FHA Structural Gene
Previous work had led to the isolation of a
cosmid clone, pUW21-26, which hybridized with both vir
and fha DNA probes (Stibitz et al., 1988, supra). The
analysis of Tn5 insertion mutations within this cosmid,
using FHA colony and Western immunoblots, had suggested
that the FHA structural gene, fhaB, was located on a 10
kb EcoRI fragment just to the right of the vir locus.
Furthermore, fhaB transcription was believed to begin
near the left-hand EcoRI site and proceed from left to
right, based upon the correlation of FHA truncated
product size with location of the corresponding Tn5
insertion site.

Deletion of the internal 2.4 kb BamHI fragment
of fhaB was performed as described above and the
mutation returned to the B. pertussis chromosome,
yielding strain BP101. The structure of the resultant
fhaB mutant locus in this strain was confirmed by
Southern blot analysis. The largest FHA cross-reactive
polypeptide produced by BP101 measures approximately
150 kDa, as determined by Western blot technique. This
truncated FHA product has no hemagglutinating activity.

These data confirmed that the structural gene
for FHA must be contained on the 10 kb EcoRI insert of
pDR1. This fragment was, therefore, subcloned for
dideoxy single-stranded DNA sequencing.

Construction of fhaB fusion proteins

Seven portions of the fhaB ORF were each
cloned into the expression vector pEX34. The result in
each case was a translational fusion with the first 98
amino acids of the phage MS2 RNA polymerase. Fusion
proteins were expressed in an E. coli host and then
purified using preparative SDS-PAGE. One reason for
the construction of these fusion proteins was to
confirm the absence of a translational stop codon in
various regions of the ORF. This aim was addressed by
comparison of measured fusion protein molecular weights
with those theoretically expected from translational
read-through of the entire cloned fhaB inserts. Table
1 lists the fusion proteins with the nucleotide
coordinates of the respective fhaB inserts: these data
confirm the absence of a stop codon in all of these
fhaB fragments.



Table 1

Fusion protein   Observed MW   Fragment       Nucleotide coordinates

protein H1       45 kDa        BamHI-RsaI     2836-3786
protein H2       85 kDa        BamHI-NruI     5212-7294
protein H3       77 kDa        PvuII-PvuII    6393-8085
protein H4       80 kDa        PvuII-BamHI    8085-9922
protein H5       55 kDa        StuI-BamHI     8752-9922
protein H6       32 kDa        EcoRV-BamHI    9462-9922
protein H7       56 kDa        BamHI-ClaI     9922-11666
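
The comparison of observed and expected fusion protein sizes described above can be approximated with a short calculation. The following is only a rough sketch, not the procedure used in the specification: it assumes an average residue mass of about 110 Da (a common rule of thumb that is not stated in the text), ignores reading-frame offsets at the cloning junctions, and truncates the H7 insert at the TAG stop codon reported at position 11041. The names `FUSIONS`, `expected_kda`, and the constants are illustrative.

```python
# Rough expected size of each MS2 polymerase/FhaB fusion protein.
AVG_RESIDUE_KDA = 0.110    # assumption: ~110 Da mean residue mass (rule of thumb)
MS2_LEADER_RESIDUES = 98   # N-terminal MS2 polymerase portion, from the text
ORF_STOP = 11041           # first base of the TAG stop codon, from the text

# name: (insert start, insert end, observed MW in kDa), taken from Table 1
FUSIONS = {
    "H1": (2836, 3786, 45),
    "H2": (5212, 7294, 85),
    "H3": (6393, 8085, 77),
    "H4": (8085, 9922, 80),
    "H5": (8752, 9922, 55),
    "H6": (9462, 9922, 32),
    "H7": (9922, 11666, 56),
}


def expected_kda(start: int, end: int) -> float:
    """Estimate fusion size assuming read-through of the whole coding insert."""
    coding_end = min(end, ORF_STOP)           # translation stops at the ORF's TAG
    insert_residues = (coding_end - start) // 3
    return (MS2_LEADER_RESIDUES + insert_residues) * AVG_RESIDUE_KDA


for name, (start, end, observed) in FUSIONS.items():
    estimate = expected_kda(start, end)
    print(f"protein {name}: expected ~{estimate:.1f} kDa, observed {observed} kDa")
```

Under these assumptions every estimate falls within a few kDa of the observed value, which is the pattern expected when no stop codon interrupts the cloned fragment; a premature stop would give an observed size well below the estimate.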



Western immunoblot analysis using fusion protein
antisera

Antisera to each of the seven fusion proteins
were prepared by intraperitoneal immunisation of mice
and were used for two purposes: to correlate each of
the FHA SDS-PAGE bands with a region of the fhaB ORF,
and to determine what portions of the ORF-encoded
polypeptide are present in whole Bordetella sp. extracts.
Table 2 shows the results of Western immunoblots using
each of the seven fusion protein antisera and an FHA
protein gel pattern.

The combination of these data with the results
of N-terminal amino acid sequencing suggests an origin
for the different FHA polypeptide species. The
stimulation of a murine polyclonal response by each of
the fhaB fusion proteins also argues that FHA contains
numerous immunogenic domains.

Nucleotide Sequence of the FHA Structural Gene

The sequencing strategy described above
yielded a 10036 bp-long nucleotide sequence for the
EcoRI fragment. Computer analysis identified an open
reading frame (ORF) 10789 bp long beginning at an ATG
translational start codon 253 bp from the left-hand
EcoRI site. Two other in-frame ATG codons are located
45 and 174 bp after the beginning of the ORF; at
approximately the position of the third ATG codon
begins the use of codons strongly preferred by B.
pertussis (defined by B. pertussis pertussis toxin
operon codon usage and the UWGCG codon preference
program; Gribskov et al., Nucleic Acids Res. (1984)
12:539-549). The ORF and preferred codon usage end at
a TAG stop codon 11041 bp from the left-hand EcoRI
site. This ORF encompasses the FHA structural gene
fhaB; the sequence of the ORF is shown below.

GAATTCCTGCGCTGGCACCCGCGGCGGGCCGGGGAGCGGGTTGTCGGCGCA 51

CGCCTATACGTGCCGGACAGGGTTTGATGGTTTGACTAAGAAATTTCCTAC 102 

AAGTCTTGTATAAATATCCATTGATGGACGGGATCATTACTGACTGACGAA 153 

GTGCTGAGGTTTATCCAGACTATGGCACTGGATTTCAAAACCTAAAACGAG 204

CAGGCCGATAACGGATTCTGCCGATTACTTCACTTCGCTGGTCGGAATATG 255 

Met 

AACACGAACCTGTACAGGCTGGTCTTCAGCCATGTTCGCGGCATGCTTGTT 306

AsnThrAsnLeuTyrArgLeuValPheSerHisValArgGlyMetLeuVal 

CCCGTGAGCGAGCATTGCACCGTCGGAAACACCTTCTGTGGGCGCACGCGT 357 
ProValSerGluHisCysThrValGlyAsnThrPheCysGlyArgThrArg 

GGTCAAGCGCGAAGTGGGGCCCGCGCCACGAGCCTGTCCGTAGCGCCCAAT 408
GlyGlnAlaArgSerGlyAlaArgAlaThrSerLeuSerValAlaProAsn

GCGCTGGCCTGGGCCCTGATGTTGGCGTGTACGGGTCTTCCGTTAGTAACG 459
AlaLeuAlaTrpAlaLeuMetLeuAlaCysThrGlyLeuProLeuValThr

CACGCCCAGGGCTTGGTTCCTCAGGGGCAGACACAGGTGCTGCAGGGCGGG 510
HisAlaGlnGlyLeuValProGlnGlyGlnThrGlnValLeuGlnGlyGly

AACAAGGTTCCCGTTGTCAATATCGCCGACCCAAATTCCGGCGGCGTCTCG 561
AsnLysValProValValAsnIleAlaAspProAsnSerGlyGlyValSer

CACAACAAGTTCCAGCAGTTCAACGTCGCCAACCCTGGCGTGGTCTTCAAC 612
HisAsnLysPheGlnGlnPheAsnValAlaAsnProGlyValValPheAsn

AACGGCCTGACCGACGGCGTGTCCAGGATCGGCGGGGCGCTGACCAAGAAC 663
AsnGlyLeuThrAspGlyValSerArgIleGlyGlyAlaLeuThrLysAsn

CCCAACCTGACTCGCCAGGCCTCGGCCATTCTTGCCGAAGTCACGGACACT 714
ProAsnLeuThrArgGlnAlaSerAlaIleLeuAlaGluValThrAspThr

TCGCCCAGTCGCCTGGCCGGTACGCTCGAAGTCTATGGCAAGGGCGCCGAC 765
SerProSerArgLeuAlaGlyThrLeuGluValTyrGlyLysGlyAlaAsp

CTCATCATCGCCAACCCCAACGGCATCAGCGTCAACGGCCTGAGCACGCTC 816
LeuIleIleAlaAsnProAsnGlyIleSerValAsnGlyLeuSerThrLeu

AACGCCAGCAACCTGACGCTCACQACGGGGCGTCCCAGCGTCAACGGCGGC 867
AsnAlaSerAsnLeuThrLeuThrThrGlyArgProSerValAsnGlyGly

CGCATCGGCCTTGATGTCCAACAGGGCACCGTCACGATCGAACGAGGCGGC 918
ArgIleGlyLeuAspValGlnGlnGlyThrValThrIleGluArgGlyGly


GTC AATGCC ACCGGCCTGGGCTATTTCGACGTGGTGGCGCGCCTGGTCAAG 969 
ValAsnAlaThrGlyLeuGlyTyrPheAspValValAlaArgLeuValLys 



CTGCAGGGTGCCGTGTCGAGCAAGCAGGGCAAGCCCCTGGCCGACATCGCG 1020 
LeuGlnGlyAlaValSerSerLysGlnGlyLysProLeuAlaAspIleAla 
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GTGGTCGCCGGCGCCAACCGGTACGACCACGCAACCCGCCGCGCC ACGCCG 1071 
ValValAlaGlyAlaAsnArgTyrAspHisAlaThrArgArgAlaThrPro 

ATCGCCGCAGGCGCGCGCGGCGCCGCCGCGGGCGCCTACGCGATTG ACGGC 1122 
IleAlaAlaGlyAlaArgGlyAlaAlaAlaGlyAlaTyrAlalleAspGly 

ACGGCGGCGGGCGCCATGTACGGCAAGCACATCACGCTGGTGTCCAGCGAT 1 173 
ThrAlaAlaGlyAlaMetTyrGlyLysHisIleThrLeuValSerSerAsp 

TCAGGCCTGGGCGTGCGCCAGCTCGGCAGCCTGTCCTCGCCATCGGCCATC 1224 
SerGlyLeuGlyValArgGlnLeuGlySerLeuSerSerProSerAlalle 

ACCGTGTCGTCGCAGGGCGAAATCGCGCTGGGCGACGCC ACGGTCC AGCGC 1275 
ThrValSerSerGlnGlyGluIleAleiLeuGlyAspAlaThrValGlnArg 

GGCCCGCTCAGCCTCAAGGGCGCGGGGGTCGTGTCGGCCGGCAAACTGGCC 1326 
GlyProLeuSerLeuLysGlyAlaGlyValValSerAlaGlyLysLeuAla 

TCCGGGGGGGGGGCGGTGAACGTCGCGGGCGGCGGGGCGGTGAAGATCGCG 1377 
SerGlyGlyGlyAlaValAsnValAlaGlyGlyGlyAlaValLys I leAla 

TCGGCCAGCAGCGTTGGAAACCTCGCGGTGCAAGGCGGCGGCAAGGTACAG 1428 
SerAlaSerSerValGlyAsnLeuAlavalGlnGlyGlyGlyLysValGln 

GCCACGCTGTTGAATGCCGGGGGGACGTTGCTGGTGTCGGGCCGCCAGGCC 1479 
AlaThrLeuLeuAsnAlaGlyGlyThrLeuLeuVaiSerGlyArgGlnAla 

GTCCAGCTTGGCGCGGCGAGCAGCCGTCAGGCGCTGTCCGTGAACGCGGGC 1530 
ValGlnLeuGlyAlaAlaSerSerArgGlnAlaLeuSerValAsnAlaGly 

GGCGCCCTCAAGGCGGAC AAGCTGTCGGCGACGCGACGGGTCGACGTGG AT 1581 
GlyAlaLeuLysAlaAspLysLeuSerAlaThrArgArgValAspValAsp 

GGCAAGCAGGCCGTCGCGCTGGGGTCGGCCAGCAGCAATGCGCTGTCGGTG 1 63 2 
GlyLysGlnAlaValAlaLeuGlySerAlaSerSerAsnAlaLeuSerVal 
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CGTGCCGGCGGCGCCCTCAAGGCGGGCAAGCTGTCGGCGACGGGGCGACTG 1683 
ArgAlaGlyGlyAlaLeuLysAlaGlyLysLeuSerAlaThrGlyArgLeu 

GACGTGGACGGCAAGCAGGCCGTCACGCTGGGTTCGGTTGCGAGCGACGGT 1734 
5 AspValAspGlyLysGlnAlaValThrLeuGlySerValAlaSerAspGly 

GCGCTGTCGGTAAGCGCTGGCGGAAACCTGCGGGCGAACGAATTGGTCTCC 1785 
AlaLeuSerValSerAlaGlyGlyAsnLeuArgAlaAsnGluLeuValSer 

10 AGTGCCCAACTTGAGGTGCGTGGGCAGCGGGAGGTCGCGCTGGATGACGCT 1836 
SerAlaGlnLeuGluValArgGlyGlnArgGluValAlaLeuAspAspAla 

TCGAGCGCACGCGGCATGACCGTGGTTGCCGCAGGAGCGCTGGCGGCCCGC 1887 
SerSerAlaArgGlyMetThrValValAlaAlaGlyAlaLeuAlaAlaArg 


AACCTGCAGTCCAAGGGCGCCATCGGCGTACAGGGTGGAGAGGCGGTCAGC 1938 
AsnLeuGlnSerLysGlyAlalleGlyValGlnGlyGlyGluAlaValSer 

GTGGCCAACGCGAACAGCGACGCGGAATTGCGCGTGCGCGGGCGCGGCCAG 1989 
20 valAlaAsnAlaAsnSerAspAlaGluLeuArgValArgGlyArgGlyGln 



GTGGATCTGCACGACCTGAGCGCAGCGCGCGGCGCGGATATCTCCGGCGAG 2040
ValAspLeuHisAspLeuSerAlaAlaArgGlyAlaAspIleSerGlyGlu

GGGCGCGTCAATATCGGCCGTGCGCGCAGCGATAGCGATGTGAAGGTCTCC 2091
GlyArgValAsnIleGlyArgAlaArgSerAspSerAspValLysValSer

GCGCACGGCGCCTTGTCGATCGATAGCATGACGGCCCTCGGTGCGATCGGC 2142
AlaHisGlyAlaLeuSerIleAspSerMetThrAlaLeuGlyAlaIleGly

GTCCAGGCAGGCGGCAGCGTGTCGGCCAAGGATATGCGCAGCCGTGGCGCC 2193
ValGlnAlaGlyGlySerValSerAlaLysAspMetArgSerArgGlyAla

GTCACCGTCAGCGGCGGCGGCGCCGTCAACCTGGGCGATGTCCAGTCGGAT 2244
ValThrValSerGlyGlyGlyAlaValAsnLeuGlyAspValGlnSerAsp

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GGGCAGGTCCGCGCCACCAGCGCGGGCGCCATGACGGTGCGAGACGTCGCG 2295 
GlyGlnValArgAlaThrSerAlaGlyAlaMetThrValArgAspValAla 

GCTGCCGCCGACCTTGCGCTGCAGGCGGGCGACGCGTTGCAGGCCGGGTTC 234 6 
5 AlaAlaAlaAspLeuAlaLeuGlnAlaGlyAspAlaLeuGlnAlaGlyPhe 

CTGAAATCGGCCGGTGCCATGACCGTGAACGGCCGCGATGCCGTGCGACTG 2397 
LeuLysSerAlaGlyAlaMetThrValAsnGlyArgAspAlaValArgLeu 

10 GATGGCGCGCACGCGGGCGGGCAATTGCGGGTTTCCAGCGACGGGCAGGCT 2 4 48 
AspGlyAlaHisAlaGlyGlyGlnLeuArgValSerSerAspGlyGlnAla 

GCGTTGGGCAGTCTCGCGGCCAAGGGCGAGCTGACGGTATCGGCCGCGCGC 2 4 99 
AlaLeuGlySerLeuAlaAlaLysGlyGluLeuThrValSerAlaAlaArg 


GCGGCGACCGTGGCCGAGTTGAAGTCGCTGGACAACATCTCCGTGACGGGC 2550 
AlaAl aThrValAlaGluLeuLy s S er LeiiAspAs n I leS erVa IThrGly 

GGCGAACGCGTGTCGGTTCAGAGCGTCAAC AGCGCGTCC AGGGTCGCCATT 2601 
20 GlyGluArgValSerValGlnSerValAsnSerAlaSerArgValAlalle 

TCGGCGCACGGCGCGCTGGATGTAGGCAAGGTTTCCGCCAAGAGCGGTATC 2652 
SerAlaHisGlyAlaLeuAspValGlyLysValSerAlaLysSerGlylle 

2 5 GGGCTC6AAGGCTGGGGCGCGGTCGGAGCGGACTCCCTCGGTTCCGACGGC 2 7 03 
GlyLeuGluGlyTrpGlyAlaValGlyAlaAspSerLeuGlySerAspGly 

GCGATCAGCGTGTCCGGGCGCGATGCGGTCAGGGTCGATCAAGCCCGCAGT 2754 
AlalleSerValSerGlyArgAspAlaValArgValAspGlnAlaArgSer 


CTTGCCGACATTTCGCTGGGGGCGGAAGGCGGCGCCACGCTGGGCGCGGTG 2805 
LexiAlaAspIleSerLeuGlyAlaGluGlyGlyAlaThrLeuGlyAlaVal 

GAGGCCGCCGGTTCGATCGACGTGCGCGGCGGATCCACGGTGGCGGCGAAC 2856 
GluAlaAlaGlySerlleAspValArgGlyGlySerThrValAlaAlaAsn 
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TCGCTGCACGCCAATCGCGACGTTCGGGTCAGCGGCAAGGATGCGGTGCGC 2907 
SerLeuEisAlaAsnArgAspValArgValSerGlyLysAspAlaValArg 

GTAACGGCCGCCACCAGCGGGGGCGGTCTGCATGTGTCGAGCGGCCGCCAG 2958 
ValThrAlaAlaThrSerGlyGlyGlyLeuHisValSerSerGlyArgGln 

CTCGATCTGGGCGCCGTGCAGGCGCGCGGCGCGCTGGCCCTGGACGGAGGC 3009 
LeuAspLeuGlyAlaValGlnAlaArgGlyAlaLeuSlaLeuAspGlyGly 

GCCGGCGTGGCGCTGCAATCGGCCAAGGCTAGCGGCACGCTGCATGTGCAG 30 60 
AlaGlyValAlaLeuGlnSerAlaLysAlaSerGlyThrLeuHisValGln 

GGCGGCGAGCACCT6GACCTGGGCACGTTGGCCGCCGTAGGGGCGGTGGAC - 3111 
GlyGlyGluHisLeuAspLeuGlyThrLeuAlaAlaValGlyAiaValAsp 

GTCAATGGCACGGGAGACGTGCGCGTTGCGAAGCTGGTGAGCGATGCAGGC 3162 
ValAs nGlyTkr Glyl^pValArgValAlaLy sLeuValS er AspAlaGly 

GCCGATCTGCAAGCGGGGCGCTCCATGACGCTGGGTATCGTCGACACGACG 3213 
AlaAspLeuGlnAlaGlyArgSerMetThrLeuGlylleValAspTiirThr 

GGCGATCTGCAGGCGCGCGCGCAGCAGAAGCTGGAGCTCGGGTCGGTTAAG 3264 
GlyAspLeuGlnAlaArgAlaGlnGlnLysLeuGluLeuGlySerValLys 

AGCGATGGCGGCCTTCAGGCGGCCGCCGGCGGGGCCCTCAGCCTGGCGGCG 3315 
SerAspGlyGlyLeuGlnAlaAlaAlaGlyGlyAlaLeuSerLexoAlaAla 

GCGGAAGTCGCAGGGGCGCTGGAGCTCTCGGGCCAGGGCGTCACCGTGGAC 3366 
AlaGluValAlaGlyAlaLeuGluLeuSerGlyGlnGlyValThrValAsp 

AGAGCCAGCGCTAGCCGGGCACGCATCGACAGCACCGGTTCGGTCGGCATC 3417 
ArgAlaSerAlaSerArgAlaArglleAspSerThrGlySerValGlylle 



GGCGCGCTGAAGGCAGGCGCTGTCGAGGCCGCCTCGCCACGGCGGGCGCGC 3468
GlyAlaLeuLysAlaGlyAlaValGluAlaAlaSerProArgArgAlaArg

CGCGCGCTGCGGCAGGATTTCTTCACGCCCGGCAGCGTGGTGGTCCGCGCC 3519
ArgAlaLeuArgGlnAspPhePheThrProGlySerValValValArgAla

CAGGGCAATGTCACGGTCGGGCGCGGCGATCCGCATCAGGGCGTGCTGGCC 3570
GlnGlyAsnValThrValGlyArgGlyAspProHisGlnGlyValLeuAla

CAGGGCGACATCATCATGGATGCGAAGGGCGGCACCTTGCTGTTGCGCAAC 3621
GlnGlyAspIleIleMetAspAlaLysGlyGlyThrLeuLeuLeuArgAsn

GATGCCTTGACCGAGAACGGGACGGTCACCATATCGGCCGATTCGGCCGTG 3672
AspAlaLeuThrGluAsnGlyThrValThrIleSerAlaAspSerAlaVal

CTCGAGCATTCCACCATCGAGAGCAAGATCAGCCAGAGCGTGCTGGCTGCC 3723
LeuGluHisSerThrIleGluSerLysIleSerGlnSerValLeuAlaAla

AAAGGGGACAAGGGCAAGCCGGCGGTGTCGGTGAAGGTCGCGAAGAAGCTG 3774
LysGlyAspLysGlyLysProAlaValSerValLysValAlaLysLysLeu

TTTCTCAATGGTACGTTGCGGGCCGTCAACGACAACAACGAAACCATGTCC 3825
PheLeuAsnGlyThrLeuArgAlaValAsnAspAsnAsnGluThrMetSer

GGGCGCCAGATCGACGTCGTGGACGGACGTCCGCAGATCACCGACGCGGTC 3876
GlyArgGlnIleAspValValAspGlyArgProGlnIleThrAspAlaVal

ACGGGCGAAGCGCGTAAGGACGAATCGGTTGTGTCCGACGCCGCGCTCGTG 3927
ThrGlyGluAlaArgLysAspGluSerValValSerAspAlaAlaLeuVal

GCCGATGGCGGTCCGATCGTGGTCGAGGCCGGCGAGCTGGTCAGCCATGCC 3978
AlaAspGlyGlyProIleValValGluAlaGlyGluLeuValSerHisAla

GGCGGTATCGGCAACGGCCGCAACAAGGAGAATGGCGCCAGCGTCACCGTG 4029
GlyGlyIleGlyAsnGlyArgAsnLysGluAsnGlyAlaSerValThrVal

CGCACGACTGGCAACCTGGTCAACAAGGGCTACATCTCGGCCGGCAAGCAG 4080
ArgThrThrGlyAsnLeuValAsnLysGlyTyrIleSerAlaGlyLysGln

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GGCGTGCTCGAGGTGGGCGGCGCCTTGACGAACGAGTTCCTGGTCGGCTCG 4131 
GlyValLeuGluValGlyGlyAlaLeuThrAsnGluPheLeuValGlySer 

GACGGC ACCCAGCGCATCGAGGCGCAGCGCATCG AGAACAGGGGCACCTTC 4182 
AspGlyThrGlnArglleGluAlaGlnArglleGluAsnArgGlyTtirPhe 

CAGAGCCAGGCTCCGGCGGGCACGGCCGGCGCCCTGGTGGTCAAGGCTGCC 4233 
GlnSerGlnAlaProAlaGlyThrAlaGlyAlaLeuValValLysAlaAla 

GAGGCCATCGTGCACGACGGCGTCATGGCCACCAAAGGCGAGATGCAGATC 4284 
GluAlalleValHisAspGlyValMetAlaTiirLysGlyGluMetGlnlle 

GCCGGC AAGGGCGGCGGGTCTCCGACCGTCACCGCCGGCGCAAAGGCGACG 4335 
AlaGlyLysGlyGlyGlySerProTlirVaiThrAlaGlyAlaLysAlaThr 

ACCAGCGCGAACAAGCTGAGCGTCGACGTGGC AAGCTGGGAC AACGCGGGA 4386 
ThrSerAlaAsnLysLeuSerValAspValAlaSerTrpAspAsnAlaGly 

AGCCTGGATATCAAGAAGGGCGGCGCGCAGGTCACGGTG6CCGGGCGCTAT 4437 
SerLeuAspIleLysLysGlyGlyAlaGlnValThrValAlaGlyArgTyr 

GCCGAGCACGGCGAGGTTTCGATACAGGGCGATTACACCGTCTCGGCCGAC 4488 
AlaGluHisGlyGluValSerlleGlnGlyAspTyrThrValSerAlaAsp 

GCCATCGCGCTGGCGGCGGAGGTCACCCAGCGCGGAGGCGCCGCGAACCT6 4539 
AlalleAlaLeiiAlaAlaGlnValThrGInArgGlyGlyAlaAlaAsnLeu 

ACCTCGCGGCACGACACCCGTTTCTCCAACAAGATTCGCCTGATGGGGCCG 4590 
TtxrSerArgHisAspThrArgPheSerAsnLysIleArgLeuMetGlyPro 

TTGCAGGTCAACGCCGGCGGGCCGGTGTCCAATACCGGCAATCTG2VAAGTG 4641 
LeuGlnVaiAsnAlaGlyGlyProValSerAsnThrGlyAsnLeuLysVal 

CGCGAGGGCGTGACCGTAACGGCGGCGTCGTTCGACAACGAGACCGGGGCC 4692 
ArgGluGlyValThrValThrAlaAlaSerPheAspAsnGluThrGlyAla 
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GAGGTCATGGCCAAGAGCGCC ACGCTGACGACTTCCGGGGCCGCGCGC AAC 4743 
GluValMetAlaLysSerAlaThrLeuThrThrSerGlyAlaAlaArgAsn 

GCGGGCAAGATGCAGGTCAAGGAGGCCGCCACGATCGTTGCCGCCAGCGTT 4794 
5 AlaGlyLysMetGlnValLysGluAlaAlaThrlleValAlaAlaSerVal 

TCCAATCCCGGC ACGTTCACGGCCGGCAAGGATATCACTGTTACCTCGCGC 4845 
SerAsnProGlyThrPheThrAlaGlyLysAspIleThrValThrSerArg 

1 0 GGAGGATTCGATAACGAAGGCAAGATGGAGTCCAACIUVGG ACATCGTCATC 4896 
GlyGlyPheAspAsnGluGlyLysMetGluSerAsnLysAspIleVallle 

AAGACGGAACAGTTCAGCAATGGCAGGGTTCTCGACGCCAAGCATGATCTG 4947 
LysThrGluGlnPheSerAsnGlyArgValLeuAspAlaLysHisAspLeu 

15 

ACGGTCACGGCGAGCGGGCAGGCGGACAACCGGGGCAGCCTGAAGGCAGGC 4998 
ThrValThrAlaSerGlyGlnAlaAspAsnArgGlySerLeuLysAlaGly 

C ACGATTTCACGGTGCAGGCCCAGCGTATCGACAATAGCGGAACC ATGGCC 5049 
20 HisAspPheThrValGlnAlaGlnArglleAspAsnSerGlyThrMetAla 

GCCGGCCACGACGCCACGCTGAAGGCGCCGCACCTGCGCAATACGGGCCAG 5100 
AlaGlyHisAspAlaThrLeuLysAlaProHisLeuArgAsnThrGlyGln 

25 GTCGTAGCCGGGCACGACATCCATATCATCAACAGCGCCAAGCTGGAGAAC 5151 
ValValAlaGlyHisAspIleHisIlelleAsnSerAlaLysLeuGluAsn 

ACCGGGCGCGTGGATGCGCGCAACGACATCGCTCTGGATGTGGCGGATTTC 5202 
ThrGlyArgValAspAlaArgAsnAspIleAlaLeuAspValAlaAspPhe 

30 

ACCAACACGGGATCCCTCTACGCCGAGCATGACGCGACGCTGACGCTTGCG 5253
ThrAsnThrGlySerLeuTyrAlaGluHisAspAlaThrLeuThrLeuAla

CAAGGCACGCAGCGCGATCTGGTGGTGGACCAGGATCATATCCTGCCGGTG 5304
GlnGlyThrGlnArgAspLeuValValAspGlnAspHisIleLeuProVal

GCGGAGGGGACGTTACGCGTCAAGGCCAAGTCGCTGACCACCGAAATCGAG 5355
AlaGluGlyThrLeuArgValLysAlaLysSerLeuThrThrGluIleGlu

ACCGGCAATCCCGGCAGCCTGATCGCCGAGGTGCAGGAAAATATCGACAAC 5406
ThrGlyAsnProGlySerLeuIleAlaGluValGlnGluAsnIleAspAsn

AAGCAGGCCATCGTCGTCGGCAAGGACCTGACGCTGAGTTCGGCGCACGGC 5457 
LysGlnAlalleValValGlyLysAspLeuThrLeuSerSerAlaHisGly 

10 AACGTGGCCAACGAAGCGAACGCGCTGCTGTGGGCCGCCGGGGAGCTGACC 5508 
AsnValAlaAsaGluAlaAsnAlaLeuLeuTrpAlaAlaGlyGluLeuThr 

GTCAAGGCGCAGAACATCACCAATAAACGGGCCGCGCTGATCGAGGCGGGC 5559
ValLysAlaGlnAsnIleThrAsnLysArgAlaAlaLeuIleGluAlaGly

GGCAACGCCCGGCTGACGGCGGCCGTTGCCTTGCTCAACAAGCTGGGCCGC 5610
GlyAsnAlaArgLeuThrAlaAlaValAlaLeuLeuAsnLysLeuGlyArg

ATTCGCGCGGGCGAGGACATGCACCTGGATGCGCCGCGCATCGAGAACACC 5661 
20 iieArgAlaGlyGluAspMetHisLeuAspAlaProArglleGluAsnThr 

GCGAAACTGAGCGGCGAGGTGCAACGCAAAGGCGTGCAGGACGTCGGGGGA 5712 
AlaLysLeuSerGlyGluValGlnArgLysGlyValGlnAspValGlyGly 

25 GGCGAGCACGGCCGCTGGAGCGGTATCGGCTATGTCAACTACTGGTTGCGC 57 63 
GlyGluHisGlyArgTrpSerGlylleGlyryrValAsaTyrrrpLeuArg 

GCCGGCAATGGGAAGAAGGCGGGAACCATCGCCGCGCCGTGGTATGGCGGT 5814 
AlaGlyAsnGlyLysLysAlaGlyThrlleAlaAlaProTrpTyrGlyGly 

GATCTGACGGCGGAGCAGTCGCTCATCGAGGTCGGCAAGGATCTCTATCTG 5865 
AspLeuThrAlaGluGlnSerLeuIleGluValGlyLysAspLeuTyrLeu 

AATGCCGGAGCGCGCAAGGACGAACATCGCCATCTGCTCAATGAAGGCGTG 5916 
AsnAlaGlyAlaArgLysAspGluHisArgHisLeuLeuAsnGluGlyVal 



30 
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ATCCAGGCGGGCGGCCATGGCCACATCGGCGGCGACGTGGACAACCGGTCG 5967 
IleGlnAlaGlyGlyHisGlyHisIleGlyGlyAspValAspAsnArgSer 

. GTGGTGCGCACCGTGTCCGCCATGGAGTATTTCAAGACGCCTCTTCCGGTG 6018 
5 ValValArgThrValSerAlaMetGluTyrPheLysThrProLeuProVal 

AGCCTGACTGCCCTGGACAATCGTGCCGGCTTGTCTCCGGCGACCTGGAAC 6069 
SerLeuThrAlaLeuAspAsnArgAlaGlyLeuSerProAlaThrTrpAsn 

10 TTCCAGTCCACGTATGAACTCCTGGATTATCTGCTGGACCAGAATCGCTAC 6120 
PheGlnSerThrTyrGluLeuLeuAspTyrLeuLeuAspGlnAsnArgTyr 

GAGTAC ATTTGGGGGCTGTATCC GACCTAC ACCGAATGGTCGGTGAAT ACG 6171 
GluTyrlleTrpGlyLeuTyrProThrTyrThrGluTrpSerValAsnThr 

15 

CTGAAGAACCTCGACCTGGGCTACCAGGCCAAGCCGGCTCCCACTGCGCCG 6222 
LeuLysAsnLeuAspLeuGlyTyrGlnAlaLysProAlaProThrAlaPro 

CCGATGCCCAAGGCTCCCGAACTCGACCTGCGTGGCCATACGCTGGAGTCG 6273- 
20 proMetProLysAlaProGluLeuAspLeuArgGlyfiisThrLeuGluSer 

GCCGAAGGCCGGAAGATCTTTGGCGAGTACAAGAAGCTGC AAGGCGAGTAC 6324 
AlaGluGlyArgLysIlePheGlyGluTyrLysLysLeuGlnGlyGluTyr 

25 GAGAAGGCGAAGATGGCCGTCCAGGCCGTGGAGGCTTACGGCGAGGCTACT 6375 
GluLysAlaLysMetAlaValGlnAlaValGluAlaTyrGlyGluAlaThr 

CGGCGCGTCCATGATCAGCTGGGCCAACGTTATGGTAAGGCCCTGGGCGGC 6426 
ArgArgValHisAspGlnLeuGlyGlnArgTyrGlyLysAlaLeuGlyGly 

30 

ATGGATGCCGAGACCAAGGAGGTCGACGGCATCATCCAGGAGTTCGCCGCG 6477 
MetAspAlaGluThrLysGluValAspGlyllelleGlnGluPheAlaAla 



GATCTGCGAACGGTCTATGCGAAGCAGGCCGACCAGGCGACCATCGACGCA 
AspLeuArgThrValTyrAlaLysGlnAlaAspGlnAlaThrlleAspAla 
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GAGACGGACAAGGTCGCCCAGCGCTACAAGTCGCAGATCGACGCGGTGCGG 
GluThrAspLy sValAlaGlnArgTyrLy sSerGlnl leAspAlaValArg 

CTGCAGGCGATCCAGCCTGGCCGGGTCACGCTGGCCAAGGCGCTGTCGGCG 
LeuGlnAlalleGlnProGlyArgValThrLeuAlaLysAlaLeuSerAla 

GCGCTGGGCGCCGACTGGCGCGCGCTGGGTCACTCCCAATTGATGCAGCGC 
AlaLeuGlyAlaAspTrpArgAlaLeuGlyHisSerGlnLeuMetGlnArg 

TGGAAGGATTTCAAGGCGGGCAAGCGCGGCGCGGAAATCGCGTTCTATCCC 
TrpLysAspPheLys&laGlyLysArgGlyAlaGluIleAlaPheTyrPro 

AAGGAACAAACCGTGCTGGCCGCCGGCGCCGGTTTGACCCTGTCCAACGGG 
LysGluGlnThrValLeuAlaAlaGlyAlaGlyLeuThrLeuSerAsnGly 

GCGATCCACAACGGCGAGAACGCCGCGCAGAATCGCGGCCGGCCGGAAGGC 
AlalleHisAsnGlyGluAsnAlaAlaGlnAsnArgGlyArgProGluGly 

CTGAAAATCGGCGCaCATTCGGCGACTTCGGTGAGCGGCTCGTTCGACGCC 
LeuLysIleGlyAlaHisSerAlaTlirSerValSerGlySerPheAspAla 

TTGCGCGACGTGGGGCTGGAAAAGCGGCTGGATATCGACGMGCGCTGGCT 
LeuArgAspValGlyLeuGluLysArgLeuAspIleAspAspAlaLeuAla 

GCCGTGCTCGTGRATCCGCATATTTTCACGCGGATCGGGGCGGCTCAGACA 
AlaValLeuValAsnProHisIlePheThrArglleGlyAlaAlaGlaThr 

TCCCTTGCC6ACGGCGCCGCCGGGCCGGCGCTGGCGCGCCAGGCCAGGCAA 
SerLeiiAlaAspGlyAlaAlaGlyProAlaLeuAlaArgGlnAlaArgGln 

GCGCCGGAGACCGACGGCATGGTGGATGCGCGAGGGCTGGGCAGCGCCGAT 
AlaProGluThrAspGlyMetValAspAlaArgGlyLeuGlySerAlaft^p 

GCGCTCGCTTCCCTGGCCAGCTTGGACGCGGCGCAAGGGCTGGAGGTATCC 
AlaLeTiAlaSerLeuAlaSerLeuAspAlaAlaGlnGlyLeuGluValSer 
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GGCAGGCGCAATGCGCAGGTGGCCGACGCCGGGCTCGCCGGGCCGAGCGCC 
GlyArgArgAsnAlaGlnValAlaAspAlaGlyLeuAlaGlyProSerAla 

GTCGCGGCGCCGGCCGTCGGGGCGGCCGATGTCGGCGTGGAGCCTGTCACG 
ValAlaAlaProAlaValGlyAlaAlaAspValGlyValGluProValThr 

GGGGACCAGGTCGACCAGCCTGTCGTGGCGGTCGGGCTCGAGCAGCCTGTC 
GlyAspGlnValAspGlnProValValAlaValGlyLeuGluGlnProVal 

GCGACGGTCCGGGTCGCGCCGCCAGCCGTCGCGTTGCCGCGGCCGCTGTTC 
AlaThrValArgValAlaProProAlaValAlaLeuProArgProLeuPhe 

GAAACCCGCATCAAGTTTATCGACCAGAGCAAATTCTACGGCTCGCGTTAT 
GluThrArglleLysPhelleAspGlnSerLysPheTyrGlySerArgTyr 

TTCTTCGAGCAGATCGGCTACAAGCCCGATCGCGCCGCGCGGGTGGCGGGC 
PhePheGluGlnlleGlyTyrLysProAspArgAlaAlaArgValAlaGly 

GACAACTATTTCGATACCACGCTGGTGCGCGAGCAGGTGCGGCGCGCCCTG 
AspAsnTyrPheAspThrThrLeuValArgGluGlnValArgArgAlaLeu 

GGCGGCTATGAAAGCCGCCTGCCCGTGCGCGGTGTCGCGTTGGTGGCCAAG 
GlyGlyTyrGluSerArgLeuProValArgGlyValAlaLeWalAlaLys 

CTGATGGATTCGGCCGGGACGGTCGGCAAGGCGCTGGGCCTGAAGGTGGGT 
LeuMetAspSerAlaGlyThrValGlyLysAlaLeuGlyLeuLysValGly 

GTCGCGCCGACC6CGCAGCAGCTCAAGCAGGCCGACCGCGATTTCGTCTGG 
ValAlaProThrAlaGlnGlnLeuLysGlnAlaAspArgAspPheValTrp 

TACGTGGATACCGTGATCGACGGCCAGAAGGTTCTCGCTCCCCGGCTGTAC 
TyrValAspThrVallleAspGlyGlnLysValLeuAlaProArgLeuTyr 

CTGACCGAGGCGACGCGCCAGGGCATCACGGATCAGTACGCCGGCGGCGGG 
LeuThrGlxxAlaThrArgGlnGlylleThrAspGlnTyrAlaGlyGlyGly 
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GCGCTGATTGCCTCCGGCGGCGACGT2y.CTGTCAATAGGGACGGCCATGAC 7803 
AlaLeuIleAlaSerGlyGLyAspValThrValAsnThrAspGlyHisAsp 

GTCAGTTCGGTCAACGGGCTGATCCAGGGCAGGAGCGTCAAGGTGGACGCG 7854 

5 valSerSerValAsnGlyLeuIleGlnGlyArgSerValLysValAspAla 

GGCAAGGGCAAGGTCGTGGTGGCCGACAGCAAGGGGGCGGGCGGCGGCATC 7905 
GlyLysGlyLysValValValAlaAspSerLysGlyAlaGlyGlyGlylle 

10 GAGGCCGATGACGAGGTCGACGTCTCAGGCCGGGaTATCGGCATCGAGGGC 7956 
GluAlaAspAspGluValAspValSerGlyArgAspIleGlylleGluGly 

GGCAAGCTGCGCGGCAAGGATGTCAGGCTCAAGGCCGACACGGTCAAGGTC 8007 
GlyLysLeuArgGlyLysAspValArgLeuLysAlaAspThrValLysVal 

GCGACCTCGATGCGTTACGACGACAAGGGCAGGCTGGCGGCGCGCGGCGAC 8058 
AlaThrSerMetArgryrAspAspLysGlyArgLeuAlaAlaArgGlyAsp 

GGCGCCC-TGGATGCGCAAGGCGGCCAGCTGCATATCGAGGCCAAGCGCCTG 8109 
20 GiyAlaLeuAspAlaGlnGlyGlyGlnLeuHisIleGluAlaLysArgLeu 

GAGACGGCCGGCGCGACGCTCAAGGGCGGCAAGGTGAAGCTGGATGTCGAT 8160 
GluThrAlaGlyAlaThrLeuLysGlyGlyLysValLysLeuAspValAsp 



25 



GACGTCAAGTTGGGCGGCGTGTACGAGGCGGGGTCCAGCTACGAGAACAAG 8211 
AspValLysLeuGlyGlyValTyrGluAlaGlySerSerTyrGluAsnLys 

AGCTCGACGCCGCTGGGCAGCCTGTTCGCCATCCTGTCGTCGACGACGGAA 8262 
SerSerThrProLeuGlySerLeuPheAlaXleLeuSerSerThrThrGlu 

ACCAACCAGTCGGCACACGCGAACCATTACGGTACGCGCATCGAAGCCGGT 8313 
ThrAsaGlnSerAlaHisAlaAsnHisTyrGlyriurArglleGluAlaGly 

ACGCTGGAAGGAAAGATGCAGAACCTGGAGATCGAAGGCGGTTCGGTCGAT 8364 
35 ThrLeuGloiGlyLysMetGlnAsnLeuGluIleGluGlyGlySerValAsp 



30 



35 



GCCGCGCATACGGACCTGTCCGTGGCCCGCGACGCGAGGTTCAAGGCCGCC 
AlaAlaHisThrAspLeuSerValAlaArgAspAlaArgPheLysAlaAla 

GCGGATTTCGCGCACGCCGAGCATGAGAAGGATGTGCGCCAACTGTCCCTG 
AlaAspPheAlaHisAlaGluHisGluLysAspValArgGlnLeuSerLeu 

GGTGCCAAGGTGGGGGCGGGCGGCTACGAGGCGGGCTTCAGCCTGGGCAGC 
GlyAlaLysValGlyAlaGlyGlyTyrGluAlaGlyPheSerLeuGlySer 

GAAAGCGGTCTGGAAGCGCACGCCGGCCGCGGTATGACCGCGGGCGCTGAA 
GluSerGlyLeuGlxiAlaHisAlaGlyArgGlyMetThrAlaGlyAlaGlu 

GTCAAGGTAGGTTATCGGGCATCGCACGAACAGTCCTCGGAAACCGAAAAG 
ValLysValGlyTyrArgAlaSerHisGluGlnSerSerGluThrGluLys 

TCCTATCGCAACGCGAACCTCAATTTCGGTGGAGGCTCCGTCGAGGCTGGC 
SerTyrArgA^nAlaAsnLeiiAsnPheGlyGlyGlySerValGlxiAlaGly 

AATGTCCTGGATATCGGCGGPGCCGACATCAACCGGAACCGGTACGGCGGC 
AsnValLexiAspIleGlyGlyAlaAspIleAsnArgAsnArgTyrGlyGly 

GCCGCGAAGGGGAACGCCGGGACCGAGGAGGCCTTGCGCATGCGCGCCAAG 
AlaAlaLysGlyAsnAlaGlyThrGluGluAlaLeuArgMetArgAlaLys 

AAGGTCGAGTCCACCAAGTACGTCAGCGAGCAGACGAGCCAGAGCTCCGGC 
LysValGluSerThrLysTyrValSerGluGlnThrSerGlnSerSerGly 

TGGAGCGTGGAGGTGGCATCGACGGCCAGTGCCCGTTCCAGCCTGCTGACG 
TrpSerValGluValAlaSerThrAlaSerAlaArgSerSerLeuLeuThr 

GCCGCCACGCGCCTGGGCGACAGCGTGGCGCAGAATGTCGAGGACGGCCGC 
AlaAlaThrArgLeuGlyAspSerValAlaGlnAsnValGluAspGlyArg 

GAGATCCGCGGCGAGCTGATGGCTGCGCAAGTCGCCGCGGAGGCCACGCAA 8976
GluIleArgGlyGluLeuMetAlaAlaGlnValAlaAlaGluAlaThrGln

CTGGTAACCGCCGACACGGCGGCGGTGGCACTGAGTGCCGGAATCAGCGCC 9027
LeuValThrAlaAspThrAlaAlaValAlaLeuSerAlaGlyIleSerAla

GACTTCGACAGCAGCCACAGCCGCTCCACCTCGCAGAATACCCAATATCTG 9078
AspPheAspSerSerHisSerArgSerThrSerGlnAsnThrGlnTyrLeu

GGCGGAAACTTGTCCATCGAGGCCACCGAGGGCGATGCGACGCTGGTGGGC 9129
GlyGlyAsnLeuSerIleGluAlaThrGluGlyAspAlaThrLeuValGly

GCGAAGTTCGGCGGTGGCGACCAGGTCAGCTTGAAGGCAGCGAAGAGCGTG 9180
AlaLysPheGlyGlyGlyAspGlnValSerLeuLysAlaAlaLysSerVal

AACCTCATGGCGGCCGAATCGACCTTCGAATCGTACTCGGAGAGCCACAAC 9231
AsnLeuMetAlaAlaGluSerThrPheGluSerTyrSerGluSerHisAsn



TTCCACGCCTCCGCCGACGCGAACCTTGGCGCCAACGCCGTGCAGGGCGCC 
PheHisAlaSerAlaAspAlaAsnLeuGlyAlaAsnAlaValGlnGlyAla 

GTTGGCCTGGGGTTGACTGCGGGTATGGGGACGTCGCATCAGATTACCAAC 
•ValGlyLeuGlyLeuThrAlaGlyMetGlyTlirSerHisGlnlleThrAsn 

GAAACCGGCAAGACCTATGCCGGAACCTCGGTGGATGCGGCGAACGTGTCG 
GluThrGlyLysTlirTyrAlaGlyThrSerValAspAlaAlaAsnValSer 

ATCGATGCAGGCAAGGATCTGAACCTTTCCGGGTCCCGCGTGCGGGGTAAG 
IleAspAlaGlyLysAspLeuAsnLeuSerGlySerArgValArgGlyLys 

CATGTTGTCCTGGATGTCGAGGGCGATATCAATGCGACCAGCAAGCAGGAT 
HisValValLeuAspValGluGlyAspIleAsnAlaThrSerLysGlnAsp 

GAACGCAACTACAACTCCAGCGGTGGCGGTTGGGACGCCTCGGCAGGGGTG 
GluArgAsnTyrAsnSerSerGlyGlyGlyTrpAspAlaSerAlaGlyVal 

GCGATTCAGAACCGCACGTTGGTTGCGCCCGTGGGGTCTGCCGGCTTCAAT 
AlalleGlnAsnArgThrLeuValAlaProValGlySerAlaGlyPlieAsn 
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TTCAATACGGAACACGACAATTCGCGCCTGACCAATGACGGGGCGGCGGGT 
PheAsnThrGluHisAspAsnSerArgLeuThrAsnAspGlyAlaAlaGly 

GTCGTTGCCAGCGACGGGTTGACGGGCCATGTGAAAGGCGACGCCAACCTG 
ValValAlaSerAspGlyLeuThrGlyHisValLysGlyAspAlaAsnLeu 

ACCGGCGCGACCATTGCCGATTTGTCGGGCAAGGGCAATCTCAAGGTCGAC 
ThrGlyAlaThrlleAlaAspLeuSerGlyLysGlyAsnLeuLysValAsp 

GGCGCGGTCAACGCGCAGAACCTGAAAGACTACCGCGACAAGGACGGCGGC 
GlyAlaValAsnAlaGlnAsnLeuLysAspTyrArgAspLysAspGlyGly 

AGCGGCGGCCTGAACGTGGGCATCTCGTCGACCACGCTGGCGCCCACCGTG 
SerGlyGlyLeuAsnValGlylleSerSerThxThrLeuAlaProThrVal 

GGCGTGGCGTTCGGCAGGGTGGCCGGAGAGGATTATCAGGCCGAGCAGCGC 
GlyValAlaPlxeGlyArgValAlaGlyGltiAspTyrGlnAlaGluGlnArg 

GCCACGATTGACGTCGGTCAAACCAAGGATCCCGCGCGCCTGCAGGTCGGC 
AlaThrlleAspValGlyGlnThrLysAspProAlaArgLeuGlnValGly 

GGCGGCGTCAAGGGTACCCTCAATCAGGACGCCGCGCAGGCCACGGTCGTT 
GlyGlyValLysGlyThrLeuAsnGlnAspAlaAlaGlnAlaThrValVal 

CAGCGCAACAAGCACTGGGCCGGAGGCGGGTCG6AATTCTCGGTG6CTGGC 
GlnArgAsnLysHisTrpAlaGlyGlyGlySerGluPheSerValAlaGly 

AAGTCACTGAAGAAGAAGAACCAGGTCCGCCCGGTGGAGACGCCGACGCCG 
LysSerLeuLysLysLysAsnGlnValArgProValGluThrProThrPro 

GATGTCGTGGATGGACCGCCTAGCCGTCCCACCACGCCGCCCGCGTCGCCG 
AspValValAspGlyProProSerArgProThrThrProProAlaSerPro 

CAGCCGATCCGCGCGACGGTCGAGGTCAGTTCGCCGCCGCCGGTGTCCGTG 
GlnProIleArgAlaTtirValGluValSerSerProProProValSerVal 
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GCCACGGTCGAAGTCGTGCCGCGGCCGAAGGTCGAAACCGGCTCAGCCGCT 10251 
AlaThrValGluValValProArgProLysValGluThrGlySerAlaAla 

TCCGCCTCGGCCGGTGGCGCCCAGGTCGTGCCGGTGACGCCTCCCAAGGTG 10302 
5- serAlaSerAlaGlyGlyiU-aGlnValValProValThrProProLysVal 

GAGGTCGCCAAGGTGGAGGTCGCCJlAGGTGGAAGTCGTGCCGCGGCCGAAG 10353 
GluValAlaLysValGluValAlaLysValGluValValProArgProLys 

10 GTTGAAACGGCTCAGCCGCTTCCGCCCCGGCCGGTGGTGGCCGAGAAGGTG 10404 
ValGluTiirAlaGlnProLeuProProArgProValValAlaGluLysVal 

ACGACGCCGGCGGTCCAGCCCCAGCTTGCCAAGGTGGAGACGCTGCAGCCG 10455 
ThrThrProAlaValGlnProGlnLeuAlaLysValGluThrValGlnPro 

15 

GTGAAGCCCGAAACCACCAAGCCGTTGCCCAAGCCGCTGCCGGTGGCGAAG 10506 
ValLysProGluThrTlirLysProLeuProLysProLeuProValAlaLys 

GTGACGAAAGCGCCGCCGCCGGTTGTGGAGACCGCCCAGCCGCTGCCGCCG' 10557 
20 valThrLysAlaProProProValValGluThrAlaGlnProLeuProPro 

GTCAAGCCACAGAAGGCGACCCCCGGCCCCGTGGCTGAGGTGGGCAAGGCT 10608 
ValLysProGlnlysAlaThrProGlyProValAlaGluValGlyLysAla 

25 ACGGTCACGACGGTGCAGGTGCAGAGTGCGCCGCCCAAGCCGGCCCCGGTG 10659 
ThrValThrThrValGlnValGlnSerAlaProProLysProAlaProVal 

GCCAAGCAGCCCGCGCCTGCACCGAAGCCCAAGCCCAAGCCCAAGCCCAAG 10710 
AlaLysGlnP roAlaP roAlaP roLysP roLy sP roLysP roLysProLy s 

30 

GCCGAGCGTCCGAAGCCGGGCAAAACGACGCCCTTGAGCGGGCGCCACGTG 10761 
AlaGluArgP roLysProGlyLysThrThrP roLeuSerGlyArgHis Val 

GTGCAACAGCAGGTGCAGGTCTTGCAGCGGCAAGCGAGTGACATCAACAAC 10812 
^ ^ ValGlnGlnGlnValGlnValLeuGloArgGlnAlaSerAspIleAsnAsn 
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ACCAAGAGCCTGCCTGGCGGGAAGCTGCCCAAGCCGGTCACCGTGAAGCTG 10863 
ThrLysSerLeuProGlyGlyLysLeuProLysProValThrValLysLeu 

ACCGACGAGAACGGCAAGCCGCAGACGTATACGATCAACCGGCGCGAGGAT 10914 
ThrAspGluAsnGlyLysProGlnThrTyrThrlleAsnArgArgGlxiAsp 






CTGATGAAGCTCAACGGCAAGGTGCTGTCCACCAAGACGACACTGGGCCTG 10965 
LeuMetLysLeuAsnGlyLysValLeuSerThrLysThrThrLeuGlyLeu 

GAGCAGACCTTCCGCCTGCGGTCGAGGATATCGGCGGCAAGAACTACCGGG 11016 
GluGlnThrPheArgLeuArgSerArglleSerAlaAlaArgThrThrGly 






TCTTCTATGAAACCAACAAATAGGTAGTCGCGGCCTGCCGCGGCTCGGCGC 11067 
SerSerMetLysProThrAsnArg 

ATGGGGATTCGCAGGGTTCTCATGCGCCGGCCAATGCCGGATAGCGGTGCA 11118 



ATTGCCGACCATTTCGCGCACCGCGCTCAAGGACGTAGGGTCGACGGCAGG 11169 






CGGGACAGTTTTTGACGTGAAACTGACCGAGTGTCCGCAGGCATTGAATGG 11220 






TCAGCAAGTGGGATTGTTCTTCGAATCTGGTGGCACGGTTGACTATACGTC 11271 



GGGAAACCTGTTTGCGTATCGGGCCGATAGTCAGGGCGTCGAACAGGCTAC 11322 



CGCAGAGCGAAAGCCGAC2UICGTGCAAGCCAATCTGGATGGTTCCGCTATT 11373 






CATTTGGGCCGCAACAAGGGTGCGCAGGCTGCTCAGACGTTTCTGGTATCG 11424 



CAGACGGCTGGGTCGTCGACGTACGGGGCGACCCTGCGCTATCTGGCATGC 11475 



TACATCCGTTCGGGCGCTGGTTCCATTGTTGCGGGGAATCTCCGCAGTCAG 11526 






GTGGGGTTCTCCGTGATGTATCCGTAGCCCGTGAAAGAGGGGTC ACCC ACT 11577 



GCGG6GGGCCCCGGTACG6GATGGTCGGCTTGTCACGAGATTCTTGTTTTC 11628 
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CATTTCTTTCTTTTCACTCGGTCGCAGCGCCGGCTTGfl-TGCATGCAAAGCA 11679 

TCGATAGCTACGAACGGCCGC6ATTCTTGAATCATGAATACATACGCTTGT 11730 


GACGGGGCGCTCGCGAGAGCCGGCCCCAGGGATGGTTTACGCCTGCATTTA 11781 

CGGTAAAGCGGCAAGGCGGCATGGCGCGCTGGCGGCGGCTGGGCGTCGCGG 11832 

CGCTGGGCCATGCTGGCGAGCCTGGCGCCGGCCGCnCGGGCAGCTyGTnAT 11883

The relative GC content of the FHA ORF is
67.5%. Examination of this nucleotide sequence for
transcriptional start signals indicates possible -35
and -10 consensus regions, TGGTTTGAC and TATAAAT,
separated by 23 base pairs, located 174 and 142 bp
upstream of the beginning of the ORF, with
transcriptional initiation apparently beginning 30
to 75 bp from the initiation codon. A possible
ribosomal binding site, GAGG, occurs 90 bp upstream of
the ORF. Another possible ribosomal binding site,
CTGG, occurs 11 bp in front of the third ATG. Further
analysis of the nucleotide sequence reveals a region of
alternating direct repeats of the pattern ABABA,
located between 1468 and 1746 bp from the left-hand
EcoRI site. Similar repeats are found in the predicted
amino acid sequence corresponding to this same region.
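
The positions quoted for the putative -35 and -10 regions can be checked directly against the first 255 nucleotides of the listing. The sketch below is illustrative only: the sequence string is transcribed from the listing above, and the names `UPSTREAM`, `ORF_START`, and `first_position` are introduced here and are not part of the specification.

```python
# First 255 nucleotides of the listing (left-hand EcoRI site through the ATG).
UPSTREAM = (
    "GAATTCCTGCGCTGGCACCCGCGGCGGGCCGGGGAGCGGGTTGTCGGCGCA"  # 1-51
    "CGCCTATACGTGCCGGACAGGGTTTGATGGTTTGACTAAGAAATTTCCTAC"  # 52-102
    "AAGTCTTGTATAAATATCCATTGATGGACGGGATCATTACTGACTGACGAA"  # 103-153
    "GTGCTGAGGTTTATCCAGACTATGGCACTGGATTTCAAAACCTAAAACGAG"  # 154-204
    "CAGGCCGATAACGGATTCTGCCGATTACTTCACTTCGCTGGTCGGAATATG"  # 205-255
)
ORF_START = 253  # 1-based coordinate of the A of the initiating ATG


def first_position(motif: str) -> int:
    """Return the 1-based coordinate of the first occurrence of motif."""
    index = UPSTREAM.find(motif)
    if index < 0:
        raise ValueError(f"{motif} not found in the upstream region")
    return index + 1


MINUS_35 = "TGGTTTGAC"
MINUS_10 = "TATAAAT"

pos_35 = first_position(MINUS_35)
pos_10 = first_position(MINUS_10)

upstream_35 = ORF_START - pos_35                 # distance to the ORF start
upstream_10 = ORF_START - pos_10
spacer = pos_10 - (pos_35 + len(MINUS_35))       # bases between the two motifs

print(upstream_35, upstream_10, spacer)  # 174 142 23
```

Distances here are measured from the first base of each motif to the first base of the ATG, which reproduces the 174 and 142 bp figures and the 23 bp spacing given in the text.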

Predicted Peptide Sequence

The predicted amino acid sequence of the FHA
ORF is 3597 residues long, with a calculated MW of 368
kDa. This is substantially larger than published
measured values. The composition of this sequence is
alanine and glycine rich (27.0%) and is nearly
identical to a previously published chemical analysis
of the FHA amino acid composition (Sato et al., 1983,
supra). The computed isoelectric point of the entire
polypeptide is 6.79.

The concentration of charged residues in the
FHA polypeptide chain is highest between positions 2000
and 2700. Hydrophobicity is highest in the N-terminal
300 residues and again at specific locations near
residues 1800-2000 and 2400-2500. There is a strongly
predicted transmembrane helix between amino acid
positions 44 and 69, with its transmembrane segment
between residues 52 and 69.

One interesting feature of the predicted amino
acid polypeptide is the sequence RRARR located at
position 1069. This highly arginine-rich sequence is a
likely site for trypsin-like proteolytic cleavage.
N-terminal amino acid sequence determinations of
several of the SDS-PAGE FHA peptide bands by other
workers confirm that cleavage, in fact, occurs at this
location. Analysis of the resultant two parts of the
FHA peptide sequence demonstrates striking differences
in chemical properties: the N-terminal 98 kDa fragment
is highly basic with a positive hydropathy score,
whereas the C-terminal 140 kDa portion is a negatively
charged acidic polypeptide which has a more hydrophilic
overall composition. Polypeptides of these two sizes
are dominant species on FHA Western immunoblots.

Cell Recognition Site

Located at amino acid position 1097 and again
at position 2599 is the tripeptide sequence RGD. This
sequence is known as a "cell recognition site" for the
interaction of fibronectin and other eukaryotic
extracellular matrix proteins with the integrin
receptor family on a variety of eukaryotic cell
surfaces (Pierschbacher and Ruoslahti, Proc. Natl.
Acad. Sci. USA (1984) 81:5985-5988; Ruoslahti and
Pierschbacher, Science (1987) 238:491-497). Secondary
structure analysis of the polypeptide sequence adjacent
to these two FHA RGD sites reveals that the first of
these is strongly predicted to be surface exposed,
hydrophilic, and antigenic. Comparison of the FHA
peptide sequence adjacent to this RGD site and the
sequence surrounding the RGD in fibronectin shows
identity at 7 of the 9 residues. Cleavage at the RRARR
processing site would leave this first RGD sequence
close to the N-terminus of the 214 kDa polypeptide
product.
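
The residue numbers given for the RRARR processing site and the first RGD tripeptide can be cross-checked against the sequence listing. The following is a minimal sketch, not part of the specification: the three-letter window is transcribed from the three listing lines ending at nucleotides 3468, 3519 and 3570, its first residue number (1056) is derived here from the ORF start at nucleotide 253, and all function and variable names are illustrative.

```python
# Map three-letter amino acid codes to one-letter codes.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}

# Translation lines for nucleotides 3418-3570 of the listing.
WINDOW_3LETTER = (
    "GlyAlaLeuLysAlaGlyAlaValGluAlaAlaSerProArgArgAlaArg"
    "ArgAlaLeuArgGlnAspPhePheThrProGlySerValValValArgAla"
    "GlnGlyAsnValThrValGlyArgGlyAspProHisGlnGlyValLeuAla"
)
ORF_START_NT = 253                                   # first base of the ATG
WINDOW_FIRST_NT = 3418                               # first base of the window
WINDOW_FIRST_RESIDUE = (WINDOW_FIRST_NT - ORF_START_NT) // 3 + 1  # 1056


def to_one_letter(three_letter: str) -> str:
    """Convert a run of three-letter codes into a one-letter string."""
    return "".join(
        THREE_TO_ONE[three_letter[i:i + 3]]
        for i in range(0, len(three_letter), 3)
    )


def motif_residues(peptide: str, motif: str, first_residue: int) -> list:
    """Return the residue number of every (possibly overlapping) motif hit."""
    hits = []
    index = peptide.find(motif)
    while index != -1:
        hits.append(first_residue + index)
        index = peptide.find(motif, index + 1)
    return hits


def residue_to_nt(residue: int) -> int:
    """First nucleotide coordinate of the codon for a given residue."""
    return ORF_START_NT + 3 * (residue - 1)


peptide = to_one_letter(WINDOW_3LETTER)
print(motif_residues(peptide, "RRARR", WINDOW_FIRST_RESIDUE))  # [1069]
print(motif_residues(peptide, "RGD", WINDOW_FIRST_RESIDUE))    # [1097]
print(residue_to_nt(1097))                                     # 3541
```

On this window the two motifs fall at residues 1069 and 1097, so the first RGD lies 28 residues downstream of the start of the RRARR sequence.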
In vitro Cell Adherence

The role of several virulence factors in
mediating adherence of B. pertussis to Chinese hamster
ovary (CHO) cells was evaluated. Table 3 indicates the
findings:



ADHERENCE OF B. pertussis STRAINS TO CHO CELLS

                                         Mean adherent bacteria
                                         per CHO cell ± SD (95%
Strain                  Fha  Fim2  Fim3  confidence interval)      % Wt

BP536 (vir+)                             363 ± 111 (243-483)       100
BP537 (vir-)                             2.55 ± 2.8 (0.71-4.39)    0.7
BP101 (fhaBΔ101)                         10.8 ± 5.2 (7.67-13.9)    3.0
BP-B52 (fim2B52, fim3::Km)               317 ± 158 (146-488)       87.3
BP353 (fhaA::Tn5)                        23.4 ± 13.8 (13.3-33.5)   6.4
BP-TOX6 (ptxA6)                          405 ± 102 (303-507)       112
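
The "% Wt" column of the table can be reproduced from the mean adherence values with a short Python sketch. The dictionary and function names are illustrative assumptions, as is the choice of BP536 as the wild-type reference (it is the strain listed at 100%); the numbers are the means given in the table.

```python
# Mean adherent bacteria per CHO cell, taken from the adherence table.
mean_adherence = {
    "BP536": 363.0,    # vir+ strain, assumed here to be the wild-type reference
    "BP537": 2.55,
    "BP101": 10.8,
    "BP-B52": 317.0,
    "BP353": 23.4,
    "BP-TOX6": 405.0,
}


def percent_of_wild_type(means, wild_type="BP536"):
    """Express each strain's mean adherence as a percentage of the reference."""
    ref = means[wild_type]
    return {strain: 100.0 * value / ref for strain, value in means.items()}


pct = percent_of_wild_type(mean_adherence)
for strain, value in pct.items():
    print(f"{strain:8s} {value:6.1f}")
```

Only the means enter this calculation; the standard deviations and confidence intervals are not propagated.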



The results described in the above section
demonstrate that the gene encoding filamentous
hemagglutinin of B. pertussis and the expressed gene
product are now available in intact and modified forms
for use in diagnosis, prophylaxis and therapy of
pertussis. Of particular interest is the use of the
gene to prepare vaccines, where the protein may be used
by itself, as a fragment, as the intact expression
product of the gene or the physiologically active
fragment thereof, or in combination with other
pertussis proteins, particularly with modified
pertussis toxin, or with proteins of other pathogens.
The subject gene may be used to enhance the amount of
the filamentous hemagglutinin present in a live or dead
B. pertussis organism or to provide for the presence of
the subject proteins in other organisms, where immune
response to more than one antigen is desired.

All publications and patent applications
mentioned in this specification are indicative of the
level of skill of those skilled in the art to which
this invention pertains. All publications and patent
applications are herein incorporated by reference to
the same extent as if each individual publication or
patent application was specifically and individually
indicated to be incorporated by reference.

The invention now being fully described, it
will be apparent to one of ordinary skill in the art
that many changes and modifications can be made thereto
without departing from the spirit or scope of the
appended claims.

WHAT IS CLAIMED IS:

1. A nucleic acid sequence of (1) less than
about 15kbp encoding the pertussis fhaB gene; or (2)
fragment thereof of at least about 15bp, or (3)
fragment thereof joined to a nucleic acid sequence from
other than B. pertussis; and free of the fhaA gene; and
other than the sequence from 5625 to 5780 joined to
other than an adjoining B. pertussis sequence.

2. A nucleic acid sequence according to
Claim 1, wherein said sequence does not extend beyond
the 5' transcriptional and translational control
sequences and the termination region of said fhaB gene.

3. A nucleic acid sequence according to
Claim 2, wherein said fragment is free of other nucleic
acid or is directly joined to a nucleic acid sequence
from other than B. pertussis.

4. A nucleic acid sequence according to
Claim 1, wherein said sequence is an N-proximal
sequence extending to at least about the sequence
encoding RRARR.

5. A nucleic acid sequence according to
Claim 1, wherein said sequence is the C-proximal
sequence extending from about the sequence encoding
RRARR.

6. A nucleic acid sequence according to 
Claim 1, comprising at least one of the sequences 3490 
to 3590, 3840 to 3940, 5840 to 5940, or 9440 to 9540 or 
a fragment of at least 15bp thereof.

7. A nucleic acid sequence according to
Claim 6, wherein said sequence is joined directly or
indirectly to other than a B. pertussis sequence.

8. A DNA sequence encoding the B. pertussis
fhaB gene or a fragment of at least 201 base pairs
joined to at least one of a promoter or termination
sequence other than the natural sequence and free of
other genes of B. pertussis.

9. A DNA sequence according to Claim 8,
comprising at least one of the sequences 3490 to 3590,
3840 to 3940, 5840 to 5940, or 9440 to 9540.

10. A DNA sequence according to Claim 8,
wherein said sequence is the N-proximal sequence
extending to at least about the sequence encoding
RRARR.

11. A DNA sequence according to Claim 8,
wherein said sequence is the C-proximal sequence 
extending from about the sequence encoding RRARR. 

12. A vector comprising a replication system 
functional in a prokaryotic host and a DNA sequence 
according to Claim 8. 

13. A vector according to Claim 12, wherein 
said vector comprises a marker for selection. 

14. A vector according to Claim 12, wherein 
said sequence is joined to a promoter and terminator 
sequence functional for expression in a prokaryotic 
host. 

15. A vector according to Claim 14, wherein 
said promoter is the natural promoter.

16. A vector according to Claim 14, wherein
said promoter is other than the natural promoter.

17. A transformed prokaryotic cell comprising a
DNA sequence comprising the fhaB gene or fragment
thereof of at least 15bp joined to other than the fhaA
gene and other than the sequence from 5625 to 5780, as
a result of in vitro introduction of said DNA sequence
into a precursor prokaryotic cell, and progeny of said
transformed prokaryotic cell.

18. A transformed prokaryotic cell according to
Claim 17, wherein said precursor cell is B. pertussis.

19. A method for producing a peptide cross-reactive
with the filamentous hemagglutinin of B. pertussis,
said method comprising:

growing a transformed prokaryotic host
comprising an fhaB expression cassette capable of
expression in said host, whereby said peptide is
expressed.

20. A method according to Claim 19, wherein 
said peptide comprises a fragment of at least 9 amino 
acids of said filamentous hemagglutinin. 

21. A method according to Claim 20, wherein
said peptide is the C-terminal portion of the fhaB
gene.

22. A vaccine comprising a peptide cross- 
reactive with the filamentous hemagglutinin of B. 
pertussis prepared by the method comprising: 

growing a transformed prokaryotic host
comprising an fhaB expression cassette capable of
expression in said host, whereby said peptide is
expressed.

23. A vaccine according to Claim 22, further
comprising a peptide cross-reactive with B. pertussis
endotoxin.

24. A vaccine according to Claim 22, wherein 
said peptide is cross-reactive with the A subunit.